Abstract

Defensive forecasting is a method of transforming laws of probability
(stated in game-theoretic terms as strategies for Sceptic) into
forecasting algorithms. There are two known varieties of defensive
forecasting: "continuous", in which Sceptic's moves are assumed to depend
on the forecasts in a (semi)continuous manner and which produces
deterministic forecasts, and "randomized", in which the dependence of
Sceptic's moves on the forecasts is arbitrary and Forecaster's moves are
allowed to be randomized. This note shows that the randomized variety can
be obtained from the continuous variety by smearing Sceptic's moves to
make them continuous.

New as compared to version 1 (17 August 2007) of this report: The
assumption of version 1 that the outcome space $\Omega$ is finite is
relaxed, and now it is only assumed to be compact. In the case where
$\Omega$ is finite, it is shown that Forecaster can choose his randomized
forecasts concentrated on a finite set of cardinality at most $|\Omega|$.



1 Introduction 

The continuous variety of defensive forecasting was essentially introduced by 
Levin [5], but was later rediscovered by Kakade and Foster [7] and Takemura et
al. PI]. 

The randomized variety was introduced (in the case of von Mises's version of
the game-theoretic approach to probability) by Foster and Vohra [5] and further
developed by, among others, Sandroni et al.; these papers, however, were
only concerned with asymptotic calibration. Non-asymptotic versions of the
randomized variety were proposed by Sandroni [10] (based on standard
measure-theoretic probability) and Vovk and Shafer [15] (based on
game-theoretic probability). Kakade and Foster [7] noticed that some
calibration results require very little randomization (this will be an
important aspect of our Theorem 2).

This note states two simple results about defensive forecasting: Theorem 1
about the continuous variety and Theorem 2 about the randomized variety.
The proof of Theorem 2 is obtained from the proof of Theorem 1 by blurring
Sceptic's moves.

In our informal discussions we will be assuming that the set of all possible 
outcomes is finite, although we will try to make mathematical statements as 
general as possible. The reader who is only interested in the main ideas might 
choose to specialize Theorems 1 and 2 and their proofs to the case of finite
$\Omega$.

2 Continuous defensive forecasting 

Let $\Omega$ (the outcome space) be a compact (i.e., a compact Hausdorff
topological space) equipped with the Baire $\sigma$-algebra and
$\mathcal{P}(\Omega)$ be the set of all probability measures on $\Omega$
equipped with the standard topology (the weak* topology on
$\mathcal{P}(\Omega)$ identified with a subset of $C(\Omega)'$ by a Riesz
representation theorem, Theorem 7.4.1 in [5]; this is also known as the
topology of weak convergence in the case of metrizable $\Omega$). The subset
$\mathcal{P}^{\mathrm{fin}}(\Omega)$ of $\mathcal{P}(\Omega)$ consists of all
probability measures in $\mathcal{P}(\Omega)$ concentrated on a finite subset
of $\Omega$. If $\Omega$ is finite,
$\mathcal{P}^{\mathrm{fin}}(\Omega) = \mathcal{P}(\Omega)$ can be identified
with an $(|\Omega| - 1)$-dimensional simplex (see below) in Euclidean space
equipped with the standard Euclidean distance and topology.

Theorem 1 will be a statement about the following perfect-information game
involving three players: 

Continuous game 

Players: Sceptic, Forecaster, Reality 

Protocol: 

  $\mathcal{K}_0 := 1$.
  FOR $n = 1, 2, \ldots$:
    Sceptic announces a function
      $S_n : \Omega \times \mathcal{P}(\Omega) \to \mathbb{R}$
      which is lower semicontinuous in the second argument
      and satisfies $\int_\Omega S_n(\omega, p)\, p(d\omega) \le 0$
      for all $p \in \mathcal{P}^{\mathrm{fin}}(\Omega)$.
    Forecaster announces $p_n \in \mathcal{P}(\Omega)$.
    Reality announces $\omega_n \in \Omega$.
    $\mathcal{K}_n := \mathcal{K}_{n-1} + S_n(\omega_n, p_n)$.

Winner: Forecaster wins if Sceptic's capital $\mathcal{K}_n$ stays bounded.

(For $p \in \mathcal{P}^{\mathrm{fin}}(\Omega)$, the integral
$\int_\Omega S_n(\omega, p)\, p(d\omega)$ is interpreted as a sum, and so
$S_n(\omega, p)$ is not required to be measurable in $\omega$.)

Intuitively, on each round of the game Forecaster is asked to give a
probability forecast $p_n$ for the outcome $\omega_n$ to be chosen by Reality.
Sceptic is testing the forecasts $p_n$ by gambling against them. Forecaster
wins the game if Sceptic does not detect serious disagreement between
Forecaster and Reality.

The continuous game is stated here in the form that makes Theorem 1 as strong
as possible. In typical applications in prediction with expert advice and
algorithmic information theory, Sceptic's move $S_n(\omega, p)$ is lower
semicontinuous jointly in $(\omega, p) \in \Omega \times \mathcal{P}(\Omega)$
and measurable in $\omega$; the condition
$\int_\Omega S_n(\omega, p)\, p(d\omega) \le 0$ is required to hold for all
$p \in \mathcal{P}(\Omega)$. Furthermore, there is an important restriction
imposed on Sceptic: he must choose $S_n$ so that his capital remains
nonnegative ($\mathcal{K}_n \ge 0$) no matter how the other players move
(in particular, the function $S_n$ must be bounded below). Theorem 1, however,
does not depend on these further assumptions.
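
As a concrete illustration of a legal move for Sceptic (a standard binary example, not taken from the text above), assume $\Omega = \{0, 1\}$ and identify a forecast $p$ with the probability it assigns to the outcome 1. The function $g$ below is a hypothetical ingredient chosen by Sceptic. The calculation checks that such a move satisfies the constraint of the protocol, with equality:

```latex
% Assumptions: \Omega = \{0, 1\}, p \in [0, 1] is the probability of outcome 1,
% and g : [0, 1] \to \mathbb{R} is an arbitrary function chosen by Sceptic.
S_n(\omega, p) := g(p)\,(\omega - p),
\qquad
\int_\Omega S_n(\omega, p)\, p(d\omega)
  = p \cdot g(p)\,(1 - p) + (1 - p) \cdot g(p)\,(0 - p)
  = 0 .
```

If $g$ is continuous, this $S_n$ is continuous in $p$ and so is admissible in the continuous game; a discontinuous $g$ gives a move that is in general admissible only in the randomized game of Section 3.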

The following result was stated (in different terms) by Levin [S]. 

Theorem 1 Forecaster has a strategy in the continuous game that guarantees
$\mathcal{K}_0 \ge \mathcal{K}_1 \ge \mathcal{K}_2 \ge \cdots$.

In other words, not only does Sceptic not detect serious disagreement between
Forecaster and Reality, he does not detect any disagreement at all.

We will reproduce Levin's original proof, as detailed by Gacs [6], Section 5; 
for a different proof (essentially a reference to Ky Fan's minimax theorem, pQ, 
Theorem 11.4) under stronger assumptions, see [14], Section 3.

A set $v_1, \ldots, v_M$ of points in a Euclidean (or topological vector)
space is affinely independent if, for all real numbers
$\lambda_1, \ldots, \lambda_M$,

$$
\sum_{m=1}^{M} \lambda_m v_m = 0
\text{ and }
\sum_{m=1}^{M} \lambda_m = 0
\text{ imply }
\lambda_1 = \cdots = \lambda_M = 0.
$$

The convex hull of such $v_1, \ldots, v_M$, denoted
$\mathrm{co}(v_1, \ldots, v_M)$, is called a simplex or, more fully, an
$(M - 1)$-dimensional simplex. The proof of Theorem 1 will
use the following result due to Knaster, Kuratowski, and Mazurkiewicz ([S]; see 
also [J, Theorem 11.2). 

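KKM Theorem Let $F_1, \ldots, F_M$ be closed subsets of a simplex
$\mathrm{co}(v_1, \ldots, v_M)$. Suppose that for all $1 \le k \le M$ and
$1 \le m_1 < \cdots < m_k \le M$ we have

$$
\mathrm{co}(v_{m_1}, \ldots, v_{m_k}) \subseteq F_{m_1} \cup \cdots \cup F_{m_k}.
$$

Then $F_1 \cap \cdots \cap F_M \ne \emptyset$.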

Proof of Theorem 1 Fix a round $n$ of the game and set $S := S_n$. For every
$\omega \in \Omega$, let $F_\omega$ be the set

$$
F_\omega := \{ p \in \mathcal{P}(\Omega) \mid S(\omega, p) \le 0 \},
$$

which is closed because $S$ is lower semicontinuous in $p$. It suffices to
show that for every finite set of points $\omega_1, \ldots, \omega_M$ we have

$$
F_{\omega_1} \cap \cdots \cap F_{\omega_M} \ne \emptyset. \tag{1}
$$

Indeed, the compactness of $\Omega$ implies the compactness of
$\mathcal{P}(\Omega)$ (combine Alaoglu's theorem, Problem 9 in Section 6.1 of
[3], with the weak* closedness of $\mathcal{P}(\Omega)$ in $C(\Omega)'$,
following from [3], Theorems 7.1.5 and 2.6.3). Therefore, if every finite
subset of the family $\{F_\omega \mid \omega \in \Omega\}$ of closed sets has
a nonempty intersection, then the whole family has a nonempty intersection,
and any of the measures in this intersection can be taken as $p_n$.

To show (1), let $\mathcal{P}(\omega_1, \ldots, \omega_M)$ be the set of
probability measures concentrated on $\{\omega_1, \ldots, \omega_M\}$. If
$p \in \mathcal{P}(\omega_1, \ldots, \omega_M)$, the inequality
$\int S(\omega, p)\, p(d\omega) \le 0$ implies $S(\omega_m, p) \le 0$ for some
$m \in \{1, \ldots, M\}$. Hence
$\mathcal{P}(\omega_1, \ldots, \omega_M) \subseteq
F_{\omega_1} \cup \cdots \cup F_{\omega_M}$, and the same holds for every
subset of the indices $\{1, \ldots, M\}$. The KKM theorem now implies (1). ∎
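
For binary outcomes the KKM argument reduces to an intermediate-value argument on $[0, 1]$, which can be sketched in code. The following is a minimal illustration under assumptions that are not in the text: $\Omega = \{0, 1\}$, a forecast is the probability `p` of outcome 1, Sceptic's move is continuous in `p`, and the names `defensive_forecast` and `example_move` are hypothetical. Bisection keeps one endpoint in $F_0$ and the other in $F_1$, so it only approximates a point of $F_0 \cap F_1$.

```python
from typing import Callable

# S(omega, p) for omega in {0, 1} and p in [0, 1] (probability of outcome 1).
Move = Callable[[int, float], float]


def defensive_forecast(S: Move, tol: float = 1e-12) -> float:
    """Return p with S(0, p) <= 0 and S(1, p) <= 0, up to bisection accuracy.

    Assumes S is continuous in p and p*S(1, p) + (1 - p)*S(0, p) <= 0 for all p.
    """
    lo, hi = 0.0, 1.0
    # The constraint at p = 0 gives S(0, 0) <= 0 and at p = 1 gives S(1, 1) <= 0,
    # so lo lies in F_0 = {S(0, .) <= 0} and hi lies in F_1 = {S(1, .) <= 0}.
    if S(1, lo) <= 0:
        return lo
    if S(0, hi) <= 0:
        return hi
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if S(0, mid) <= 0:
            if S(1, mid) <= 0:
                return mid  # mid lies in both closed sets
            lo = mid  # mid lies in F_0 only
        else:
            hi = mid  # S(0, mid) > 0 forces S(1, mid) < 0, so mid lies in F_1
    return (lo + hi) / 2


def example_move(omega: int, p: float) -> float:
    # Hypothetical Sceptic move g(p) * (omega - p) with g(p) = 0.3 - p.
    return (0.3 - p) * (omega - p)


p_n = defensive_forecast(example_move)
```

With this forecast Sceptic's gain `example_move(omega, p_n)` is nonpositive, up to the bisection tolerance, whichever outcome Reality announces.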

3 Randomized defensive forecasting 

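Let $\mathcal{P}^{\mathrm{fin}}(\mathcal{P}(\Omega))$ be the set of all
probability measures on $\mathcal{P}(\Omega)$ concentrated on a finite subset
of $\mathcal{P}(\Omega)$. For each
$P \in \mathcal{P}^{\mathrm{fin}}(\mathcal{P}(\Omega))$, let
$D(P) \subseteq \mathcal{P}(\Omega)$ be the smallest finite set in
$\mathcal{P}(\Omega)$ of $P$-probability one.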

Our result about randomized defensive forecasting concerns the following 
perfect-information game involving four players: 

Randomized game 

Players: Sceptic, Forecaster, Reality, Random Number Generator 
Protocol: 
  $\mathcal{K}_0 := 1$.
  $\mathcal{F}_0 := 1$.
  FOR $n = 1, 2, \ldots$:
    Sceptic announces a function
      $S_n : \Omega \times \mathcal{P}(\Omega) \to \mathbb{R}$
      which is continuous in the first argument $\omega \in \Omega$
      and satisfies $\int_\Omega S_n(\omega, p)\, p(d\omega) \le 0$
      for all $p \in \mathcal{P}(\Omega)$.
    Forecaster announces
      $P_n \in \mathcal{P}^{\mathrm{fin}}(\mathcal{P}(\Omega))$.
    Reality announces $\omega_n \in \Omega$.
    Forecaster announces a function
      $f_n : \mathcal{P}(\Omega) \to \mathbb{R}$
      such that $\int_{\mathcal{P}(\Omega)} f_n \, dP_n \le 0$.
    Random Number Generator announces $p_n \in D(P_n)$.
    $\mathcal{K}_n := \mathcal{K}_{n-1} + S_n(\omega_n, p_n)$.
    $\mathcal{F}_n := \mathcal{F}_{n-1} + f_n(p_n)$.

Restriction on Sceptic: Sceptic must choose $S_n$ (continuous, and so Baire
measurable, in its first argument) so that his capital remains nonnegative
($\mathcal{K}_n \ge 0$) no matter how the other players move (in particular,
the function $S_n$ must be bounded below).

Restriction on Forecaster: Forecaster must choose his moves so that his
capital remains nonnegative ($\mathcal{F}_n \ge 0$) no matter how the other
players move.

Winner: Forecaster wins if either (i) his capital $\mathcal{F}_n$ tends to
infinity or (ii) Sceptic's capital $\mathcal{K}_n$ stays bounded.

(Since $\int_{\mathcal{P}(\Omega)} f_n \, dP_n = \int_{D(P_n)} f_n \, dP_n$ is
a sum, its existence does not depend on the measurability of $f_n$. However,
by the Tietze-Urysohn theorem, Theorem 2.1.8 in [4], $f_n$ can be chosen
continuous and, therefore, Baire measurable; the Tietze-Urysohn theorem is
applicable since every compact is normal, [4], Theorem 3.1.9.)

Forecaster is now allowed to randomize, and it is Random Number Generator
who picks the actual forecast $p_n$ from Forecaster's randomized forecast
$P_n$. As before, Sceptic is testing the forecasts $p_n$ by gambling against
them. To make sure that Random Number Generator performs his duty of producing
random-looking $p_n$, Forecaster is allowed to gamble against Random Number
Generator's choices. Forecaster wins the game if he either discredits Random
Number Generator or Sceptic does not detect serious disagreement between the
forecasts and the outcomes.

In the case of finite $\Omega$, the only restriction on Sceptic's move $S_n$
is $\int_\Omega S_n(\omega, p)\, p(d\omega) \le 0$,
$\forall p \in \mathcal{P}(\Omega)$. We will see that in this case Theorem 2
will remain true even if $f_n$ is required to be a linear function on the
simplex $\mathcal{P}(\Omega)$.

The following is the randomized counterpart of Theorem 1.

Theorem 2 For any $\epsilon > 0$ and any sequence
$\mathcal{A}_1, \mathcal{A}_2, \ldots$ of open covers of
$\mathcal{P}(\Omega)$, Forecaster has a strategy in the randomized game that
guarantees:

• $\mathcal{K}_n \le (1 + \epsilon) \mathcal{F}_n$ for each $n$;

• $D(P_n)$ lies completely in one element of $\mathcal{A}_n$;

• $|D(P_n)| \le |\Omega|$.

The last item, $|D(P_n)| \le |\Omega|$, is of interest only in the case of
finite $\Omega$: it holds trivially when $\Omega$ is infinite.

Before discussing the intuition behind Theorem 2 we restate the second
item in a more intuitive form assuming that $\Omega$ is finite and dist is
the Euclidean distance on the simplex $\mathcal{P}(\Omega)$. (More generally,
$\Omega$ can be assumed a compact metric space and dist be, e.g., the
Prokhorov metric on $\mathcal{P}(\Omega)$; see, e.g., [2], Appendix III,
Theorem 6.)

Corollary 1 Suppose $\Omega$ is finite (or a metric compact). For any
$\epsilon > 0$ and any sequence $\epsilon_1, \epsilon_2, \ldots$ of positive
real numbers, Forecaster has a strategy in the randomized game that
guarantees:

• $\mathcal{K}_n \le (1 + \epsilon) \mathcal{F}_n$ for each $n$;

• the diameter of $D(P_n)$ is at most $\epsilon_n$:

$$
\operatorname{diam} D(P_n)
  := \sup_{p, q \in D(P_n)} \operatorname{dist}(p, q)
  = \max_{p, q \in D(P_n)} \operatorname{dist}(p, q)
  \le \epsilon_n;
$$

• $|D(P_n)| \le |\Omega|$.

The condition $\mathcal{K}_n \le (1 + \epsilon) \mathcal{F}_n$ says that
Forecaster can guarantee $\mathcal{F}_n \ge \mathcal{K}_n$ to any
approximation required, i.e., every pound gained by Sceptic can be
attributed to the poor performance of Random Number Generator. The condition
$\operatorname{diam} D(P_n) \le \epsilon_n$ shows that already a tiny amount
of randomization is sufficient; as already mentioned, a similar observation
was made by Kakade and Foster [7].

Proof of Theorem 2 We will repeatedly use the fact that $\mathcal{P}(\Omega)$
is paracompact ([4], Theorem 5.1.1). The stronger condition that
$\mathcal{P}(\Omega)$ is compact will only be used in a reference to
Theorem 1.

Fix a round $n$ of the game. Let $\delta > 0$ be a small constant (how small
will be determined later). For each $p \in \mathcal{P}(\Omega)$ set

$$
A_p := \left\{ q \in \mathcal{P}(\Omega) \;\middle|\;
  \int_\Omega S_n(\omega, p)\, q(d\omega) < \delta \right\}; \tag{2}
$$

notice that $p \in A_p$ and that $A_p$ is an open set. Let $\mathcal{B}$ be
any open star refinement of $\mathcal{A}_n$ (it exists by [3], Theorem 5.1.12,
(i) and (iii)), let $\mathcal{C}$ be any locally finite open refinement of
$\mathcal{B}$ (it exists by the definition of paracompactness), and let $B_p$
be the intersection of $A_p$ with an arbitrary element of $\mathcal{C}$
containing $p$. Notice that the $B_p$ form an open cover of
$\mathcal{P}(\Omega)$. If $\Omega$ is finite, replace
$\{B_p\}_{p \in \mathcal{P}(\Omega)}$ by its open shrinking of order
$|\Omega| - 1$ (it exists by the Dowker theorem, Theorem 7.2.4 in [4], since
$\mathcal{P}(\Omega)$ is normal, Theorem 3.1.9 in [4], and
$\dim(\mathcal{P}(\Omega)) = |\Omega| - 1$, [4], Theorem 7.3.19); we will use
the same notation $\{B_p\}_{p \in \mathcal{P}(\Omega)}$ for the shrinking.
Let $\{f_s\}_{s \in \mathbb{S}}$ be a locally finite partition of unity
subordinated to the open cover $\{B_p\}_{p \in \mathcal{P}(\Omega)}$ ([4],
Theorem 5.1.9). For each $s \in \mathbb{S}$ choose a
$p_s \in \mathcal{P}(\Omega)$ such that
$\{p \mid f_s(p) > 0\} \subseteq B_{p_s}$. Set, for $\omega \in \Omega$ and
$p \in \mathcal{P}(\Omega)$,

$$
S^*(\omega, p) := \sum_{s \in \mathbb{S}} S_n(\omega, p_s) f_s(p)
$$

(notice that only a finite number of addends are non-zero, so the sum is
well-defined).

In the previous section we were considering Sceptic's moves $S_n(\omega, p)$
lower semicontinuous in $p$ and satisfying
$\int_\Omega S_n(\omega, p)\, p(d\omega) \le 0$ for all
$p \in \mathcal{P}^{\mathrm{fin}}(\Omega)$. It is clear that $S^*(\omega, p)$
is even continuous in $p$; let us check that it almost satisfies
$\int_\Omega S^*(\omega, p)\, p(d\omega) \le 0$ for all
$p \in \mathcal{P}(\Omega)$. We have:

$$
\int_\Omega S^*(\omega, p)\, p(d\omega)
  = \int_\Omega \sum_{s \in \mathbb{S}_p} S_n(\omega, p_s) f_s(p)\, p(d\omega)
  = \sum_{s \in \mathbb{S}_p} \int_\Omega S_n(\omega, p_s)\, p(d\omega)\, f_s(p)
  \le \sum_{s \in \mathbb{S}_p} \delta f_s(p) = \delta, \tag{3}
$$

where $\mathbb{S}_p$ is the finite set of all $s$ for which $f_s(p) > 0$; the
inequality in (3) uses the fact that $p \in B_{p_s} \subseteq A_{p_s}$ and the
definition (2). Therefore, $\int_\Omega \bar{S}(\omega, p)\, p(d\omega) \le 0$
for all $p$, where $\bar{S} := S^* - \delta$. Applying to $\bar{S}$ the
argument given in the proof of Theorem 1 we can see that there exists
$p^* \in \mathcal{P}(\Omega)$ satisfying $\bar{S}(\omega, p^*) \le 0$, i.e.,
$S^*(\omega, p^*) \le \delta$, for all $\omega \in \Omega$.

Make Forecaster select $P_n$ concentrated on the $p_s$ with positive
$f_s(p^*)$ and assigning weight $f_s(p^*)$ to each of these $p_s$. This will
ensure that $P_n$ is concentrated on a finite subset, $D(P_n)$, of an element
of $\mathcal{A}_n$ and that $|D(P_n)| \le |\Omega|$.

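The rest of the proof proceeds similarly to the proof of Theorem 3 in [15].
Let $\delta$ be $\epsilon 2^{-n}$ or less. This will ensure

$$
\int_{\mathcal{P}(\Omega)} S_n(\omega, p)\, P_n(dp) \le \epsilon 2^{-n} \tag{4}
$$

for all $\omega \in \Omega$. Let Forecaster's strategy further tell him to use
as his second move the function $f_n$ given by

$$
f_n(p) := \frac{1}{1 + \epsilon}
  \left( S_n(\omega_n, p) - \epsilon 2^{-n} \right) \tag{5}
$$

for $p \in D(P_n)$ and defined arbitrarily for $p \notin D(P_n)$. The
condition $\int f_n \, dP_n \le 0$ is then guaranteed by (4).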
It remains to check $\mathcal{K}_n \le (1 + \epsilon) \mathcal{F}_n$ (this
will also establish that $\mathcal{F}_n$ is never negative). This can be done
by a formal calculation (as in the proof of Theorem 3 in [15]), but I prefer
the following intuitive picture. We would like Forecaster to use
$f_n(p) := S_n(\omega_n, p) - \epsilon 2^{-n}$ (for $p \in D(P_n)$) as his
second move; this would always keep his capital $\mathcal{F}_n$ above
$\mathcal{K}_n - \epsilon$. To make sure that $\mathcal{F}_n$ is never
negative, Forecaster would have to start with initial capital
$\mathcal{F}_0 = 1 + \epsilon$, which, moreover, would lead to
$\mathcal{F}_n \ge \mathcal{K}_n$, $\forall n$; our protocol, however,
requires $\mathcal{F}_0 = 1$. Therefore, Forecaster's strategy has to be
scaled down to the initial capital 1, leading to (5);
$\mathcal{F}_n \ge \mathcal{K}_n$ becomes
$(1 + \epsilon) \mathcal{F}_n \ge \mathcal{K}_n$. (Scaling down a strategy to
a smaller initial capital means that the player multiplies the strategy's
moves by the same factor as he has multiplied the initial capital, thus
assuring that the capital on succeeding rounds is also multiplied by this
factor.) ∎
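
The smearing step of the proof can be made concrete in the simplest case. The sketch below is an illustration only, under assumptions that are not in the text: $\Omega = \{0, 1\}$, a forecast is the probability `p` of outcome 1, the covers $\{B_p\}$ are replaced by a uniform grid of `K + 1` points, and the partition of unity consists of tent functions on that grid. The name `randomized_forecast` is hypothetical, and $\delta$ is taken to be the grid spacing times $\max_s |S(1, p_s) - S(0, p_s)|$, which bounds $\int S^*(\omega, p)\, p(d\omega)$ for this grid. Forecaster's second move $f_n$ and the capital accounting are not implemented.

```python
from typing import Callable, List, Tuple

# S(omega, p) for omega in {0, 1} and p in [0, 1]; may be discontinuous in p.
Move = Callable[[int, float], float]


def randomized_forecast(S: Move, K: int = 100, tol: float = 1e-12
                        ) -> Tuple[List[Tuple[float, float]], float]:
    """Return (P, delta), where P is a list of (forecast, weight) pairs with
    sum(weight * S(omega, forecast)) <= delta (up to tol) for omega = 0, 1."""
    grid = [s / K for s in range(K + 1)]
    # Assumed choice of delta: grid spacing times the largest |S(1,q) - S(0,q)|.
    delta = max(abs(S(1, q) - S(0, q)) for q in grid) / K

    def weights(p: float) -> Tuple[int, float]:
        # Tent functions: weight 1 - t on grid[i] and t on grid[i + 1].
        i = min(int(p * K), K - 1)
        return i, p * K - i

    def S_bar(omega: int, p: float) -> float:
        # Smeared move S*(omega, p) minus delta; continuous in p.
        i, t = weights(p)
        return (1 - t) * S(omega, grid[i]) + t * S(omega, grid[i + 1]) - delta

    # Bisection as in the continuous binary case, applied to S_bar.
    lo, hi = 0.0, 1.0
    if S_bar(1, lo) <= 0:
        p_star = lo
    elif S_bar(0, hi) <= 0:
        p_star = hi
    else:
        while hi - lo > tol:
            mid = (lo + hi) / 2
            if S_bar(0, mid) <= 0:
                if S_bar(1, mid) <= 0:
                    lo = hi = mid  # both inequalities hold; stop here
                else:
                    lo = mid
            else:
                hi = mid
        p_star = (lo + hi) / 2

    # P_n puts weight f_s(p*) on each grid point p_s with f_s(p*) > 0.
    i, t = weights(p_star)
    P = [(grid[i], 1 - t), (grid[i + 1], t)]
    return [(q, w) for q, w in P if w > 0], delta


def jump_move(omega: int, p: float) -> float:
    # Hypothetical move, discontinuous in p at 0.5, with zero integral against p.
    sign = 1.0 if p < 0.5 else -1.0
    return sign * (omega - p)


P_n, delta_n = randomized_forecast(jump_move)
```

In this example the returned `P_n` is supported on at most two adjacent grid points, in line with $|D(P_n)| \le |\Omega| = 2$ and with a diameter equal to the grid spacing.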

Corollary 2 Forecaster has a winning strategy in the randomized game. 

Proof We are required to show that for every legal strategy $\mathcal{S}$ for
Forecaster, we can construct another legal strategy $\mathcal{S}^*$ such that
whenever $\mathcal{S}$'s capital is unbounded, $\mathcal{S}^*$'s tends to
infinity. I will reproduce a simple construction (which I learned from Shen)
given in [15], the proof of Theorem 3. (For a more efficient, in certain
respects, construction see [12], Lemma 3.1; an even better construction
has been recently devised by Vereshchagin and Shen.)

We choose some number larger than 1, say 2. Starting, as the game requires,
with initial capital 1 for Forecaster, we have him play $\mathcal{S}$ until
its capital exceeds 2. Then he sets aside 1 of this capital and continues with
a rescaled version of $\mathcal{S}$, scaled down to the reduced capital. When
the capital again exceeds 2, he again sets aside 1, and so forth. The money
set aside, which is part of the capital earned by this strategy, grows without
bound. ∎
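
The set-aside construction can be traced on a given capital path. The sketch below is illustrative and rests on assumptions not in the text: the original strategy is represented only by its capital path `capital_path` (starting at 1), the function name `set_aside` is hypothetical, and money is set aside repeatedly within a round while the active capital still exceeds the threshold.

```python
from typing import List, Tuple


def set_aside(capital_path: List[float], threshold: float = 2.0
              ) -> List[Tuple[float, float]]:
    """Given the capital path c_0 = 1, c_1, ... of a strategy, return the
    (active, reserve) capital of the rescaled strategy after each round."""
    scale = 1.0    # the new strategy plays `scale` times the original moves
    reserve = 0.0  # money set aside so far; it is never gambled again
    out = [(capital_path[0], reserve)]
    for c in capital_path[1:]:
        active = scale * c  # capital of the scaled copy after this round
        while active > threshold:
            active -= 1.0   # set aside one unit
            reserve += 1.0
        if c > 0:
            scale = active / c  # rescale to the reduced capital
        out.append((active, reserve))
    return out


# Hypothetical path that is volatile rather than steadily growing.
history = set_aside([1.0, 3.0, 0.5, 4.0])
```

The total capital of the new strategy is `active + reserve`; the reserve never decreases, so it increases by one each time the active capital crosses the threshold.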

Theorem 2 imposes a condition of continuity on Sceptic's move $S_n$ whereas
Theorem 1 only requires lower semicontinuity (in a different argument). A
natural question is whether we can relax the former condition. The key point
in the proof of Theorem 2 where the continuity of $S_n$ in $\omega$ is used is
the claim that the set (2) is open. This claim will still be true if $S_n$ is
only required to be upper semicontinuous in $\omega$, at least when $\Omega$
is a metric compact. We did not pursue this generalization since it can be
deduced from Theorem 2 as a corollary (Corollary 3 below).

Let us say that a real-valued function $f$ on $\Omega$ is strongly upper
semicontinuous if there is a monotonic sequence of bounded above real-valued
functions $f_1 \ge f_2 \ge \cdots$ on $\Omega$ that converges to $f$
everywhere. For metric compacts, this
requirement coincides with upper semicontinuity ([1], Problems 1.7.15(c) and 
3.12.23(g)), but in general it is stronger (g], Problems 1.7.14(a), 1.7.15(c), and 
3.12.23(g)). 

Corollary 3 Theorem 2 will continue to hold if the condition that
$S_n(\omega, p)$ be continuous in $\omega \in \Omega$ in the randomized game
is relaxed to the condition that $S_n(\omega, p)$ be strongly upper
semicontinuous in $\omega \in \Omega$.

Proof The proof proceeds similarly to the end of the proof of Theorem 2. Let
$\epsilon'$ be a small positive constant (we will need
$(1 + \epsilon')^2 \le 1 + \epsilon$). Fix, for a moment, a round $n$ of the
game. By the monotone convergence theorem and the definition of strong upper
semicontinuity, there exists a function
$S'_n : \Omega \times \mathcal{P}(\Omega) \to \mathbb{R}$ such that
$S'_n \ge S_n$, $S'_n(\omega, p)$ is continuous in $\omega \in \Omega$, and
$\int_\Omega S'_n(\omega, p)\, p(d\omega) \le \epsilon' 2^{-n}$ for all
$p \in \mathcal{P}(\Omega)$. Theorem 2 is applicable to Sceptic's move

$$
S''_n := \frac{1}{1 + \epsilon'} \left( S'_n - \epsilon' 2^{-n} \right)
$$

on round $n$, for each $n = 1, 2, \ldots$, and it asserts the existence of a
strategy for Forecaster ensuring

$$
\mathcal{K}_n \le \mathcal{K}'_n \le (1 + \epsilon') \mathcal{K}''_n
  \le (1 + \epsilon')^2 \mathcal{F}_n,
$$

where $\mathcal{K}'$ is the capital corresponding to the strategy $S'$
(formally, $\mathcal{K}'_n := 1 + \sum_{i=1}^{n} S'_i(\omega_i, p_i)$) and
$\mathcal{K}''$ is the capital corresponding to the strategy $S''$. ∎
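
The step from the capital of $S'$ to the capital of $S''$ in this proof can be checked directly from the definitions. The following calculation is a sketch added for illustration (it is not spelled out in the text); it abbreviates $S'_i(\omega_i, p_i)$ and $S''_i(\omega_i, p_i)$ as $S'_i$ and $S''_i$, and assumes the scaling $S''_i = (S'_i - \epsilon' 2^{-i}) / (1 + \epsilon')$:

```latex
% Assumptions: S''_i = (S'_i - \epsilon' 2^{-i}) / (1 + \epsilon'),
% \mathcal{K}'_n = 1 + \sum_{i=1}^n S'_i and \mathcal{K}''_n = 1 + \sum_{i=1}^n S''_i.
(1 + \epsilon')\, \mathcal{K}''_n
  = (1 + \epsilon') + \sum_{i=1}^{n} \left( S'_i - \epsilon' 2^{-i} \right)
  = (1 + \epsilon') + (\mathcal{K}'_n - 1) - \epsilon' \left( 1 - 2^{-n} \right)
  = \mathcal{K}'_n + \epsilon' 2^{-n}
  \ge \mathcal{K}'_n .
```

In particular, $\mathcal{K}''_n$ is nonnegative whenever $\mathcal{K}'_n$ is, so the rescaled move respects the restriction on Sceptic.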

Theorem 2 is a general form of Theorem 5 in [15] (that theorem is not part
of the journal version). This note is self-contained from the mathematical point
of view, but for further motivation behind Theorem 2 the reader is referred to

EE- 

4 Discussion 

This note assumes that the outcome space $\Omega$ is a compact. This assumption
is not as restrictive as it seems since a wide range of topological spaces have
compactifications that are still "nice" topological spaces (cf. [13], the
subsection on pp. 4-5). It appears that implications of this fact for
prediction with expert advice (see, e.g., [14]) deserve to be explored.